-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Custom identities #4764
Custom identities #4764
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Ignored Deployment
|
Passing run #7150 ↗︎
Details:
Review all test suite changes for PR #4764 ↗︎ |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4764 +/- ##
==========================================
- Coverage 86.63% 86.61% -0.02%
==========================================
Files 339 339
Lines 20008 20078 +70
Branches 2556 2583 +27
==========================================
+ Hits 17333 17391 +58
- Misses 2206 2215 +9
- Partials 469 472 +3 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@galvana really nice work making this comprehensive update, and thank you for anticipating all the impacts this change may have, including with relaxing the constraint to allow >1 identity value!
most of my comments are relatively minor tweaks, and i'm able to generally follow the identity-management updates well - i think your approach for custom identity support there is clever and it seems pretty robust (a few things around the edges that may help to make it a bit more defensive/less prone to accidental error moving forward)!
in terms of the graph_task
updates, i'm following the concrete functionality you've added, i just generally always have a bit of trouble wrapping my head around what pre_process_input_data
does generally - it's a pretty weighty method! so i'm feeling a bit less confident in my analysis of those changes. nothing particularly stands out to me as problematic, but it may be good to sync up to understand a bit more concretely how those updates will play in with the overall workflow...
as mentioned above, UAT testing is looking good! so i'm about ready to approve this, perhaps you can just look over my comments and we can align on the graph_task updates, and then we should be good to push this through 👍
src/fides/api/schemas/redis_cache.py
Outdated
|
||
from fides.api.custom_types import PhoneNumber | ||
from fides.api.schemas.base_class import FidesSchema | ||
|
||
MultiValue = Union[Union[StrictInt, StrictStr], List[Union[StrictInt, StrictStr]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: i feel like there's a slightly more idiomatic way to do this - or at least it feels more readable to me. maybe you disagree, or i've got something wrong 😅
MultiValue = Union[Union[StrictInt, StrictStr], List[Union[StrictInt, StrictStr]]] | |
MultiValue = Union[StrictInt, StrictStr, List[Union[StrictInt, StrictStr]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I was struggling to get these types to work as expected so I completely looked past the double union 😆
def __init__(self, **data: Any): | ||
for field, value in data.items(): | ||
if field not in self.__fields__: | ||
if isinstance(value, LabeledIdentity): | ||
data[field] = value | ||
elif isinstance(value, dict) and "label" in value and "value" in value: | ||
data[field] = LabeledIdentity(**value) | ||
else: | ||
raise ValueError( | ||
f'Custom identity "{field}" must be an instance of LabeledIdentity ' | ||
'(e.g. {"label": "Field label", "value": "123"})' | ||
) | ||
super().__init__(**data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fancy!
nicely done to have this "validation" in the constructor to allow extra fields but still effectively constrain them 👍
def dict(self, *args: Any, **kwargs: Any) -> Dict[str, Any]: | ||
""" | ||
Returns a dictionary with LabeledIdentity values returned as simple values. | ||
""" | ||
d = super().dict(*args, **kwargs) | ||
for key, value in self.__dict__.items(): | ||
if isinstance(value, LabeledIdentity): | ||
d[key] = value.value | ||
else: | ||
d[key] = value | ||
return d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fancy x2
{% for identity_type, identity_data in request.identity.items() %} | ||
<div>{{ identity_data.label }}:</div> | ||
<div>{{ identity_data.value }}</div> | ||
{% endfor %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, really comprehensive updates to get this working smoothly across the whole stack!
('CH-1', 'Jane Customer', 100, 'Cookie Rookie'), | ||
('CH-2', 'John Customer', 200, 'Cookie Connoisseur'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just thought it sounded funny 😆 I didn't know about the site
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hahaha i know i also thought it sounded great so i looked it up and was so happy with what i found
if isinstance(value, dict): | ||
label = value["label"] | ||
value = value["value"] | ||
else: | ||
label = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be worth making this a bit more defensive, or at least adding in some code comments? i know with the current code this is safe, but i get a little bit concerned about a direct dict lookup like this buried pretty deep in the code causing a tough-to-debug/predict problem later on. i guess the risk would be if we ever evolved to have a proper field on the Identity
class with a dict
type. but maybe just adding in some extra checks on this side too, similar to as you've done in the Identity
constructor (i.e. and "label" in value and "value" in value:
)? even throwing a more specific runtime error there to alert developers could help prevent a tricky debug later on...
if isinstance(value, dict): | |
label = value["label"] | |
value = value["value"] | |
else: | |
label = None | |
if isinstance(value, dict): | |
if "label" in value and "value" in value: | |
label = value["label"] | |
value = value["value"] | |
else: | |
raise RuntimeError(f"Programming error: unexpected dict value '{value}' found in an Identity's `labeled_dict()`!") | |
else: | |
label = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion!
for key, value in privacy_request.get_persisted_identity() | ||
.labeled_dict(include_default_labels=True) | ||
.items() | ||
if value["value"] is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this not a bit risky? i may be missing something, a bit hard for me to determine the possible values here!
if value["value"] is not None | |
if value.get("value") is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reasoning as the other ["value"]
access, this should be a dict with label
and value
keys and want to error if that's not the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be nice to get a bit of unit test coverage on these new functions specifically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests around the mutability functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lovely! thank you, those look great. they also work as a form of documenting the functionality :)
return output | ||
output[FIDESOPS_GROUPED_INPUTS].add(make_immutable(grouped_data)) | ||
|
||
return make_mutable(output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok! this is a very helpful explanation. may be good to put a note about this functionality into the method docstring? (it's already a very explanatory docstring :) )
def make_immutable(obj: Any) -> Any: | ||
if isinstance(obj, dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstrings here and on make_mutable
would be nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added docstrings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I addressed most of your comments except the one about adding more tests to the dict
and labeled_dict
functions. I'd like to know what test cases you had in mind.
src/fides/api/schemas/redis_cache.py
Outdated
|
||
from fides.api.custom_types import PhoneNumber | ||
from fides.api.schemas.base_class import FidesSchema | ||
|
||
MultiValue = Union[Union[StrictInt, StrictStr], List[Union[StrictInt, StrictStr]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I was struggling to get these types to work as expected so I completely looked past the double union 😆
if isinstance(value, dict): | ||
label = value["label"] | ||
value = value["value"] | ||
else: | ||
label = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion!
schema = Identity() | ||
for field in self.provided_identities: # type: ignore[attr-defined] | ||
value = field.encrypted_value.get("value") | ||
if field.field_label: | ||
value = LabeledIdentity(label=field.field_label, value=value) | ||
setattr( | ||
schema, | ||
field.field_name.value, | ||
field.encrypted_value["value"], | ||
field.field_name, # type:ignore | ||
value, # type:ignore | ||
) | ||
return schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good approach 👍
for key, value in privacy_request.get_persisted_identity() | ||
.labeled_dict(include_default_labels=True) | ||
.items() | ||
if value["value"] is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reasoning as the other ["value"]
access, this should be a dict with label
and value
keys and want to error if that's not the case.
return output | ||
output[FIDESOPS_GROUPED_INPUTS].add(make_immutable(grouped_data)) | ||
|
||
return make_mutable(output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output dictionary is constructed with deduplicated values for each key, ensuring that the value lists
and the fides_grouped_input list contain only unique elements.
def make_immutable(obj: Any) -> Any: | ||
if isinstance(obj, dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added docstrings
('CH-1', 'Jane Customer', 100, 'Cookie Rookie'), | ||
('CH-2', 'John Customer', 200, 'Cookie Connoisseur'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just thought it sounded funny 😆 I didn't know about the site
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests around the mutability functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking great after the latest updates! discussed some potential further tweaks offline that may be nice, and also getting some FE code review, but this is looking good to go from my end 👍
// extract identity input values | ||
const identityInputValues = Object.fromEntries( | ||
Object.entries(action.identity_inputs ?? {}).map(([key, field]) => { | ||
const value = values[key] || null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we care about handling boolean and number values here? If values[key]
is falsy (including false
or 0
), this syntax will overwrite that with null. If this is just trying to fall back if the value doesn't exist, I would prefer using ??
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I added explicit checks for undefined
and ""
before falling back to null
.
Closes PROD-1806
Description Of Changes
This change adds the ability to use custom identities in the dataset identity references
Since the likelihood that a Fides instance will use multiple identities after this change (for example
email
+customer_id
), I also made a few changes in our graph utils to support multiple identities for SaaS connectors.Code Changes
Privacy Center
PrivacyRequestForm.tsx
to be able to render the new custom identitiesAdmin UI
Identity management
providedidentitytype
constraint onprovidedidentity.field_name
to allow any custom defined field nameIdentity
schema to allow extra fields as long as they have theLabeledIdentity
typecache_identity
/get_cached_identity_data
andpersist_identity
/get_persisted_identity
functions to support labeled identitiesTask execution
pre_process_input_data
ingraph_task.py
to return unique output values (see inline comments)Steps to Confirm
fides/data/sample_project/privacy_center/config/config.json
nox -s fides_env(test)
postgres_example_test_extended_dataset
in the DSR package that is written tofides_uploads
Pre-Merge Checklist
CHANGELOG.md